fix: correct MultiTurnSample user_input validation logic #2426
+29
−1
Fixed validation bug where generator expression was not being evaluated.
Changed from checking generator object to using all() to properly validate
all messages are instances of HumanMessage, AIMessage, or ToolMessage.
Added tests to verify validation works correctly.
Problem Description
Problem: The MultiTurnSample.validate_user_input() method had a validation bug: its generator expression was never evaluated. The code checked if not (isinstance(m, ...) for m in messages): which creates a generator object that is always truthy, so the validation never triggered.
Impact: Invalid message types could pass this validator if they somehow bypassed Pydantic's type checking. In practice, Pydantic's Union validation catches most cases before this validator runs, but the validator's own logic was broken and would not reject invalid input when called.
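To make the truthiness issue concrete, here is a minimal sketch in plain Python. It does not use ragas or Pydantic; the list contents and the str check are stand-ins chosen only for illustration, not the actual message types.

```python
# Illustrative stand-in data: one item is a str, one is not.
messages = ["not a message", 42]

# Buggy pattern: the parentheses build a generator object. A generator is
# truthy no matter what it would yield, so `not gen` is always False.
gen = (isinstance(m, str) for m in messages)
buggy_rejects = not gen

# Fixed pattern: all() consumes the generator and returns a real bool.
fixed_rejects = not all(isinstance(m, str) for m in messages)

print(buggy_rejects, fixed_rejects)  # prints: False True
```

The buggy check never rejects the list even though 42 is not a str, while the all() version does.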
How to replicate: The bug can be seen in src/ragas/dataset_schema.py:131-133, where the generator expression without all() never validates the messages.
Changes Made
- src/ragas/dataset_schema.py: Changed if not (isinstance(m, ...) for m in messages): to if not all(isinstance(m, ...) for m in messages): so that every message type check is evaluated
- tests/unit/test_dataset_schema.py:
  - test_multiturn_sample_validate_user_input_invalid_type(): Verifies that invalid message types are rejected
  - test_multiturn_sample_validate_user_input_valid_types(): Verifies that valid message types are accepted
References
- src/ragas/dataset_schema.py (lines 131-133)
- tests/unit/test_dataset_schema.py (lines 201-226)
- MultiTurnSample class, which is used throughout the codebase for multi-turn conversation evaluation
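
The behavior the two tests are described as verifying can be sketched without ragas or Pydantic. Everything below is a hypothetical stand-in: the dataclasses only mimic the names HumanMessage, AIMessage, and ToolMessage, the free function validate_user_input is not the real method, and the error message text is an assumption. In the real model, Pydantic's Union validation may reject bad input before this check runs.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the real message classes.
@dataclass
class HumanMessage:
    content: str


@dataclass
class AIMessage:
    content: str


@dataclass
class ToolMessage:
    content: str


ALLOWED = (HumanMessage, AIMessage, ToolMessage)


def validate_user_input(messages):
    """Stand-in validator using the corrected all() check."""
    if not all(isinstance(m, ALLOWED) for m in messages):
        # Assumed wording; the real error message is not shown in the PR text.
        raise ValueError("all messages must be HumanMessage, AIMessage, or ToolMessage")
    return messages


def rejects(messages):
    """Return True if the validator raises for these messages."""
    try:
        validate_user_input(messages)
    except ValueError:
        return True
    return False


valid = [HumanMessage("hi"), AIMessage("hello"), ToolMessage("result")]
invalid = [HumanMessage("hi"), "plain string"]

print(rejects(valid), rejects(invalid))  # prints: False True
```

With the original generator-only check in place of all(), rejects(invalid) would return False, which is the failure the invalid-type test is meant to catch.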